ABSTRACT

The Fitts correlation algorithm has been widely used for over forty years in high-speed video trackers. It
has the advantage that it is very simply implemented on a digital computer with a small number of
calculations. At each step the algorithm attempts to estimate the shift between an image of a moving target
and a prototype image. There are several well-known shortcomings of the Fitts algorithm. First, the error
in the shift estimate increases if the shift is greater than one pixel of the digital image. Second, the Fitts
algorithm is susceptible to errors from sensor noise if the video images have a low signal-to-noise ratio.
These errors can force a lower tracker closed-loop bandwidth to maintain track loop stability. An
alternative correlation tracker algorithm is known as Projection Based Phase Only Correlation. In this
paper we compare the two algorithms with respect to the effect of sensor noise.

Keywords:  Fitts  Algorithm,  Fast  Cross-Correlation,  Phase  Only  Matched  Filter,  LADAR 

1.0  INTRODUCTION 

Correlation-based shift estimation algorithms are often used in tracking systems to estimate the change in
position of an object between an image frame and a reference image. The image shift is estimated from the
peak of the cross-correlation. The main issue with such an approach is that the cross-correlation operation is
computationally intensive, while most tracking algorithms run at a high frame rate and require the shift
estimation algorithm to operate in real time.

In addition, the reference image is usually unknown when the algorithm is initiated, or it may change
with time, and so it must be estimated on the fly. The reference image is often estimated via some sort
of maximum a posteriori approach; for this reason it is often referred to as the MAP image. This is usually
done with a straightforward recursive averaging algorithm.

One of the most widely used fast correlation algorithms is known as the Fitts¹ algorithm. In use since the
1970s, it is simple and fast. A second fast algorithm is the projection-based algorithm, which reduces a 2-D
cross-correlation to two 1-D cross-correlations. In this paper we introduce a projection-based phase only
(PBPO) cross-correlation algorithm, which is a hybrid of the phase only matched filter and
projection-based cross-correlation. The purpose of this paper is to make some comparisons between the
Fitts and PBPO algorithms.


2.0  FITTS  ALGORITHM 


The Fitts algorithm¹ has been widely used for correlation trackers for over forty years, primarily due to its
simplicity and corresponding speed. Consider the Taylor series expansion of a prototype image w shifted by
some amount δ,


$$ w(x - \delta) = w(x) - \frac{\partial w(x)}{\partial x}\,\delta + \text{higher order terms}. \tag{1} $$

Note that w is a two-dimensional image, but for brevity we use only a single index x.
Let our measurement be

$$ d(x) = w(x - \delta) \approx w(x) - \frac{\partial w(x)}{\partial x}\,\delta, \tag{2} $$


where  we  keep  only  the  linear  term  of  the  expansion.  Then 

$$ \delta = -\frac{1}{\partial w(x)/\partial x}\,\bigl(d(x) - w(x)\bigr), \tag{3} $$

for all x over the image. This results in a system of equations from which we can get a least squares
estimate of the shift δ. Fitts¹ puts this into the form of a least squares matched filter,

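$$ \delta = \frac{1}{c}\int_{\text{image}} W(x)\,\bigl(d(x) - w(x)\bigr)\,dx, \tag{4} $$

where

$$ c = \int_{\text{image}} \left(\frac{\partial w(x)}{\partial x}\right)^{2} dx, \tag{5} $$

and

$$ W(x) = -\frac{\partial w(x)}{\partial x}, \tag{6} $$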


where  g  represents  measurement  errors  and  any  aspect  change  of  the  target. 


To implement equation (4) we must have an estimate w(x) of the image of the object we are trying to track.
The usual approach is to attempt to obtain a maximum a posteriori estimate of the object, or MAP for
short, from a recursive average of past measurements.
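The recursive average can be sketched as follows in Python with NumPy. The function name `update_map` and the gain `alpha` are hypothetical illustrations; the text does not specify the averaging gain, nor whether each frame is re-registered before it is averaged in, so this is a sketch under those assumptions and not the tracker's actual update.

```python
import numpy as np

def update_map(w_prev, d_new, alpha=0.1):
    """One step of an exponentially weighted recursive average.

    w_prev : current MAP (reference) image estimate
    d_new  : newest measurement frame, same shape as w_prev
    alpha  : assumed averaging gain in (0, 1]; larger values follow
             target changes faster but average out less noise
    """
    w_prev = np.asarray(w_prev, dtype=float)
    d_new = np.asarray(d_new, dtype=float)
    # Blend the old estimate with the new frame.
    return (1.0 - alpha) * w_prev + alpha * d_new
```

In a track loop the function would be called once per frame, with the result fed back in as `w_prev` for the next frame.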


The derivatives in equations (5) and (6) must be estimated from the pixelated image data:

$$ \frac{\partial w(x,y)}{\partial x} \approx \frac{w(x+1,y) - w(x-1,y)}{2}, \tag{7} $$

$$ \frac{\partial w(x,y)}{\partial y} \approx \frac{w(x,y+1) - w(x,y-1)}{2}. \tag{8} $$


Sub-pixel shifts can be obtained directly from the Fitts algorithm without any special added processing;
however, the algorithm starts to break down with shifts greater than one pixel³,⁴,⁵.
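A minimal Python/NumPy sketch of the least squares estimate described in this section is given here. The function name `fitts_shift` is hypothetical. The sketch assumes the convention d(x, y) ≈ w(x − δx, y − δy), uses central differences for the derivatives, estimates the two axes independently, and simply drops the one-pixel border where the central difference is undefined. None of these implementation choices are stated in the text.

```python
import numpy as np

def fitts_shift(d, w):
    """Linearized least-squares shift estimate (dx, dy) of d relative to w.

    Arrays are indexed [row, column] = [y, x].
    """
    d = np.asarray(d, dtype=float)
    w = np.asarray(w, dtype=float)
    # Central-difference derivatives of the reference on interior pixels.
    wx = (w[1:-1, 2:] - w[1:-1, :-2]) / 2.0
    wy = (w[2:, 1:-1] - w[:-2, 1:-1]) / 2.0
    # Difference between measurement and reference on the same pixels.
    e = (d - w)[1:-1, 1:-1]
    # delta = (1/c) * sum(W * (d - w)), with W = -dw/dx and c = sum((dw/dx)^2).
    dx = -np.sum(wx * e) / np.sum(wx * wx)
    dy = -np.sum(wy * e) / np.sum(wy * wy)
    return dx, dy
```

On a smooth target shifted by a fraction of a pixel, this estimator returns the shift directly with sub-pixel resolution. For shifts of several pixels the first-order expansion no longer holds and the estimate degrades.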

3.0  PROJECTION-BASED  PHASE  ONLY  ALGORITHM 

Image shift is often estimated via a cross-correlation based approach. The idea is to estimate the shift of
the object from the peak of the cross-correlation between the image measurement and the MAP estimate of
the object. The cross-correlation between a measurement image d(x) and the MAP estimate w(x) can be
given by:


$$ r_{w,d}(z) = \int w(x)\, d(z + x)\, dx. \tag{9} $$


The normalized cross-correlation estimate of the shift is then given by:

$$ \delta = \arg\max_{z} \frac{r_{w,d}(z)}{\sqrt{r_{w,w}(z)\, r_{d,d}(z)}}, \tag{10} $$

where the denominator mitigates effects of the measurement and MAP image shape on the
cross-correlation.


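It is computationally more efficient to calculate the cross-correlation in the frequency domain using fast
Fourier transforms:

$$ r_{w,d}(z) = \mathcal{F}^{-1}\Bigl(\mathcal{F}\bigl(d(x)\bigr) \cdot \mathcal{F}\bigl(w(x)\bigr)^{*}\Bigr), \tag{11} $$

where $\mathcal{F}$ denotes the Fourier transform and $^{*}$ the complex conjugate.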
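The frequency domain calculation can be sketched in Python with NumPy as follows. The sketch is 1-D for brevity, and the function name `xcorr_fft` is hypothetical. Because it uses the discrete Fourier transform, the result is the circular (periodic) cross-correlation. A practical tracker might window or zero-pad the data first, which is not shown here.

```python
import numpy as np

def xcorr_fft(d, w):
    """Circular cross-correlation r(z) = sum_x w(x) * d(x + z) via the FFT."""
    D = np.fft.fft(np.asarray(d, dtype=float))
    W = np.fft.fft(np.asarray(w, dtype=float))
    # Multiply by the conjugate of the reference spectrum, then invert.
    # For real inputs the imaginary part is only round-off, so drop it.
    return np.real(np.fft.ifft(D * np.conj(W)))
```

The index of the largest element of the returned array is the integer shift estimate, counted modulo the array length.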


Unfortunately there is no direct Fourier transform analogue to efficiently calculate the normalized
cross-correlation shown in equation (10). There is, however, an approximate relationship known as the
“phase only matched filter”⁶,⁷. In the Fourier domain

$$ \frac{\mathcal{F}\bigl(d(x)\bigr) \cdot \mathcal{F}\bigl(w(x)\bigr)^{*}}{\bigl|\mathcal{F}\bigl(d(x)\bigr)\bigr| \cdot \bigl|\mathcal{F}\bigl(w(x)\bigr)\bigr|}. \tag{12} $$

In equation (12) we use the Fourier amplitudes to normalize the product of the complex Fourier transforms,
essentially keeping only the phase information of the numerator.


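The corresponding shift estimate is

$$ \delta_{x,y} = \arg\max_{x,y}\, \mathcal{F}^{-1}\left[\frac{\mathcal{F}\bigl(d(x,y)\bigr) \cdot \mathcal{F}\bigl(w(x,y)\bigr)^{*}}{\bigl|\mathcal{F}\bigl(d(x,y)\bigr)\bigr| \cdot \bigl|\mathcal{F}\bigl(w(x,y)\bigr)\bigr|}\right]. \tag{13} $$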


Horner⁸ has shown that equation (13) produces a much more peaked result on which to perform shift
estimation by finding a maximum than equation (10).

In equations (9)-(12) d and w represent two-dimensional images. We carried only one index variable to
make the equations simpler. The Fourier transforms shown are two-dimensional.
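A minimal Python/NumPy sketch of a two-dimensional phase only shift estimate follows. The function name `phase_only_shift_2d` and the small constant `eps`, which guards against division by zero, are assumptions made for illustration. The sketch returns integer-pixel shifts only and, because of the discrete Fourier transform, treats the shift as circular.

```python
import numpy as np

def phase_only_shift_2d(d, w, eps=1e-12):
    """Integer shift (dx, dy) of d relative to w from the phase-only peak.

    Arrays are indexed [row, column] = [y, x].
    """
    D = np.fft.fft2(np.asarray(d, dtype=float))
    W = np.fft.fft2(np.asarray(w, dtype=float))
    cross = D * np.conj(W)
    # Divide out the Fourier amplitudes so that only the phase remains.
    surface = np.real(np.fft.ifft2(cross / (np.abs(D) * np.abs(W) + eps)))
    iy, ix = np.unravel_index(np.argmax(surface), surface.shape)
    ny, nx = surface.shape
    # Peak indices past the midpoint correspond to negative shifts.
    dy = iy - ny if iy > ny // 2 else iy
    dx = ix - nx if ix > nx // 2 else ix
    return int(dx), int(dy)
```

For a pure circular shift the normalized cross spectrum is a linear phase ramp, so the correlation surface collapses to a single sharp peak at the shift.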

The number of computations needed to produce the shift estimate shown in equation (13) can be
significantly reduced by using a projection-based algorithm. The projection-based algorithm replaces the
two-dimensional Fourier transform with two one-dimensional transforms, greatly reducing the total number
of calculations.
Fig. 1 Illustration of the row and column projections for the 2-D projection algorithm.


The projection concept is illustrated in Fig. 1. Two projections are formed by summing along the rows of
the image and then summing along the columns, resulting in two one-dimensional arrays. Shifts in an image
measurement with respect to a reference image can then be estimated from the peaks in the 1-D
cross-correlations. Cain⁴ has shown that the location of the two 1-D cross-correlation peaks is the same as
the location of the 2-D cross-correlation peak.


The PBPO cross-correlation algorithm uses only the phase to perform the 1-D Fourier domain
cross-correlations. It is an approximation to the widely used normalized cross-correlation approach.


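$$ \delta_x = \arg\max_{z}\, \mathcal{F}^{-1}\left[\frac{\mathcal{F}\bigl(d(x)\bigr) \cdot \mathcal{F}\bigl(w(x)\bigr)^{*}}{\bigl|\mathcal{F}\bigl(d(x)\bigr)\bigr| \cdot \bigl|\mathcal{F}\bigl(w(x)\bigr)\bigr|}\right], \tag{14} $$

$$ \delta_y = \arg\max_{z}\, \mathcal{F}^{-1}\left[\frac{\mathcal{F}\bigl(d(y)\bigr) \cdot \mathcal{F}\bigl(w(y)\bigr)^{*}}{\bigl|\mathcal{F}\bigl(d(y)\bigr)\bigr| \cdot \bigl|\mathcal{F}\bigl(w(y)\bigr)\bigr|}\right], \tag{15} $$

where d(x), w(x) and d(y), w(y) denote the one-dimensional projections of the measurement and MAP
images, and the Fourier transforms are one-dimensional.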
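The two steps of the PBPO estimate, projection followed by 1-D phase only correlation, can be sketched in Python with NumPy as below. The names `pbpo_shift` and `_phase_only_peak` and the constant `eps` are hypothetical. The sketch returns integer-pixel shifts and assumes circular shifts; sub-pixel interpolation of the peak is not attempted.

```python
import numpy as np

def _phase_only_peak(p_d, p_w, eps=1e-12):
    """Signed integer shift between two 1-D projections (phase-only peak)."""
    D = np.fft.fft(p_d)
    W = np.fft.fft(p_w)
    cross = D * np.conj(W)
    # Normalize by the Fourier amplitudes, keeping only the phase.
    r = np.real(np.fft.ifft(cross / (np.abs(D) * np.abs(W) + eps)))
    k = int(np.argmax(r))
    n = r.size
    # Peak indices past the midpoint correspond to negative shifts.
    return k - n if k > n // 2 else k

def pbpo_shift(d, w):
    """Shift (dx, dy) of image d relative to w; arrays indexed [y, x]."""
    d = np.asarray(d, dtype=float)
    w = np.asarray(w, dtype=float)
    # Summing down each column (axis 0) leaves a profile along x;
    # summing across each row (axis 1) leaves a profile along y.
    dx = _phase_only_peak(d.sum(axis=0), w.sum(axis=0))
    dy = _phase_only_peak(d.sum(axis=1), w.sum(axis=1))
    return dx, dy
```

Only 1-D transforms of length N are needed here, compared with 2-D transforms of the full N×N image in the direct phase only approach.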


4.0  ALGORITHM  COMPARISONS 

The beauty of the Fitts algorithm is its simplicity and speed. The motivation for looking at the PBPO
algorithm is to achieve a fast shift estimation algorithm that can be run in real time while overcoming some
of the shortcomings of the Fitts algorithm¹⁰. It is well known that the Fitts algorithm’s shift estimation
error increases for shifts greater than one pixel². The Fitts algorithm works best with high-contrast
objects against a smooth background. If the MAP image contains a lot of high spatial frequency
background clutter, the differencing operation shown in eqns. (7) & (8) can lead to additional errors. The
projection operation in the PBPO algorithm helps to smooth out high frequency noise and
background clutter in both the MAP image and the measurement image.
Fig. 2 Comparison of shift error between the Fitts algorithm and PBPO: a) Fitts algorithm, b) PBPO algorithm.

Figure 2 shows a comparison of the shift errors in the two algorithms as a function of the applied shift
input. It can be seen that the Fitts error increases when the shift is greater than one pixel, while the PBPO
shift estimate remains linear out to many pixels. The stair-step error in the PBPO shift estimator (Fig. 2b) is
due to the pixelation in the 1-D discrete Fourier transforms used in its calculation.

Fitts Mults      PBPO Mults

N²               4N log₂N + 8N

Table 1 Number of multiplication operations for N×N images.

In evaluating the performance of an algorithm, the most time-consuming operation is multiplication, and so
we consider the number of multiplication operations required to implement the cross-correlations. These
are shown in Table 1. It is assumed that the measurement and the MAP are N×N images. The Fitts
algorithm requires N² multiplications. Since the PBPO uses 1-D Fourier transforms of the projections, the
two Fourier transforms each require N log₂N multiplications forward and backward. There are then 2N
complex multiplications, each requiring 4 real multiplications. Table 1 indicates that if N is 32 or larger,
the PBPO algorithm will require fewer multiplications than the Fitts algorithm.
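The crossover implied by Table 1 can be checked with a few lines of Python. The function names are hypothetical, and the counts are simply the two expressions from Table 1, not measured operation counts.

```python
import math

def fitts_mults(n):
    """Multiplications for the Fitts algorithm on an n x n image (Table 1)."""
    return n * n

def pbpo_mults(n):
    """Multiplications for the PBPO algorithm on an n x n image (Table 1)."""
    return 4 * n * math.log2(n) + 8 * n

# Tabulate both counts for a few power-of-two image sizes.
for n in (8, 16, 32, 64, 128, 256):
    print(n, fitts_mults(n), int(pbpo_mults(n)))
```

Under these expressions the counts are 256 (Fitts) against 384 (PBPO) at N = 16, and 1024 against 896 at N = 32. The gap then widens in favor of PBPO as N grows.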


5.0  NOISE 

The derivation of the Fitts algorithm in Section 2.0 ignored error in the measurements d(x). In fact

$$ d'(x) = d(x) + n(x), \tag{16} $$

where n(x) is the measurement noise, which is assumed to be uncorrelated and Gaussian.

Inserting equation (16) into equation (4) gives the noise associated with the shift estimate:

$$ \delta' = \delta + \frac{1}{c}\int_{\text{image}} W(x)\, n(x)\, dx. \tag{17} $$

The second term in equation (17) is the noise term. This term is the correlation of the noise with
W(x), the negated gradient of the MAP image.
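As a worked step, the variance of this noise term can be written out under two additional assumptions that are not stated in the text: the noise is zero-mean with the same variance $\sigma^2$ at every pixel ($\sigma^2$ is a symbol introduced here), and the integral is taken as a sum over pixels. The normalization $c$ is taken to be the summed squared gradient, $c = \sum_x \bigl(\partial w(x)/\partial x\bigr)^2 = \sum_x W(x)^2$, as in the least squares form of Section 2.0. Because the noise is uncorrelated from pixel to pixel, the cross terms vanish and

$$ \operatorname{Var}\left[\frac{1}{c}\sum_x W(x)\, n(x)\right] = \frac{\sigma^2}{c^2}\sum_x W(x)^2 = \frac{\sigma^2}{c}. $$

Under these assumptions the shift noise variance scales with the sensor noise variance and falls as the summed squared gradient of the MAP image grows. This is consistent with the observation in Section 4.0 that the Fitts algorithm works best with high-contrast objects.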

The projection-based PBPO algorithm, as outlined in Section 3.0, first calculates the two projections of the
measurements by summing over the rows and columns. The 1-D correlations are then formed with the two
projections of the MAP image. In this case the noise term can be given by the 1-D correlation between the
projection of the MAP image and the projection of the noise:

$$ \delta' = \delta + \int \Bigl(\sum_y w(x,y)\Bigr) \cdot \Bigl(\sum_y n(x,y)\Bigr)\, dx. \tag{18} $$

The  projection  operation  acts  as  a  smoothing  filter.  Thus  in  most  cases  we  would  expect  the  second  term  of 
(18)  to  be  less  than  the  second  term  of  (17). 


6.0  CONCLUSIONS 

High-speed video trackers rely on fast correlation algorithms. Direct correlation calculations are far too
computationally intensive to operate in real time. The Fitts correlation algorithm is widely used because it
is simple and straightforward to implement and can produce sub-pixel shift estimates directly from the
calculations without the necessity of Fourier transforms. It starts off with an optimal matched filter. An
approximation to the matched filter weights is made using only the first term of a Taylor series expansion.
Because only the first term in the expansion is used, the shift estimate error increases when the shift is
greater than one pixel.

The PBPO algorithm is conceptually more complicated; however, if it is carefully implemented it requires
fewer multiplication operations for sufficiently large images. It is a hybrid approach which combines the
projection-based cross-correlation algorithm with phase only matched filtering. It works well with
multi-pixel shifts and, because the projections are a smoothing operation, it works well with high frequency
noise and clutter in the background.

The PBPO algorithm uses projections of the measured data in the x and y directions. Since the projection
operation acts as a low pass filter, we expect the noise associated with this algorithm to be less than
that associated with the Fitts algorithm.